ABSTRACT: In the present study, an improved architecture for the Inverse Fast Fourier Transform (IFFT) is developed
and presented. The conventional Inverse Fast Fourier Transform performs a large number of arithmetic operations.
An enhanced pruning algorithm is used to reduce the number of arithmetic operations in the FFT architecture. The
performance of the improved FFT architecture is evaluated to assess its suitability for low-power wireless
communication systems. An 8-point FFT architecture was implemented with the decimation-in-frequency algorithm in a
hardware description language, targeting the XC7z020clg484-1 device from the Zynq-7000 family at a clock frequency
of 220 MHz. The improved FFT architecture reduces the arithmetic operations by a maximum of 40%, which reduces the
power consumption by a maximum of 10%. Hence, the improved FFT architecture could be used in the signal processing
units of wireless applications.

KEYWORDS: Enhanced Pruned Algorithm (EPA), Field-programmable gate array (FPGA), Quadrature amplitude modulation (QAM).


https://doi.org/10.29294/IJASE.7.2.2020.1770-1775 


1. INTRODUCTION 

Developments in wireless communication technology are
introducing new challenges and opportunities for
researchers. The continuous growth of internet traffic,
which demands high transmission rates, motivates the
design of various wireless networks. The Offset
Quadrature Amplitude Modulation (OQAM) transmitter
plays a dominant role in wireless communication systems
in meeting the challenges of future systems. This paper
aims to reduce the power and complexity of the Inverse
Fast Fourier Transform block in a wireless communication
system. Advancements in VLSI design and signal processing
enhance the energy efficiency and reduce the design
complexity of the transmitter. Razavi et al. [1] reported
the capacity of OFDM/OQAM, evaluated by the isotropic
orthogonal transfer method through information-theoretic
analysis. The spectral efficiencies of the developed
OFDM/OQAM and conventional OFDM were evaluated, but the
complexity of the system still has to be decreased.
Chuanxue et al. [2] presented the architecture of a
conjugated transmission system for FBMC/OQAM. A frequency
and time prototype filter for Offset Quadrature Amplitude
Modulation with better spectral shape and mobility
support was reported by Jeremy Nadal et al. [3]. The
reported scheme showed better FBMC/OQAM performance by
obtaining linear combination diversity.


Moreover, Guillem et al. [4] and Jeong-Min Choe et al.
[5] compared the modulation schemes used in modern
wireless communication systems; channel estimation and
improved spectral efficiency were also presented.
Trung-Hien Nguyen et al. [6] presented a frequency
sampling equalizer for chromatic dispersion compensation,
in which an adaptive maximum likelihood estimator was
used for phase noise compensation. A trade-off analysis
between chromatic dispersion and phase noise compensation
was also made. Amir Aminjavaheri et al. [7] presented an
asymptotic study of the performance of filter bank
multicarrier in the context of massive multi-input
multi-output. An efficient equalization method was
presented that achieves arbitrarily large
signal-to-interference-plus-noise ratio values by
increasing the number of base station (BS) antennas. The
design of multiple-input multiple-output decoding and
pre-coding matrices for a Filter Bank Multicarrier system
was reported by Marius Caus and Perez-Neira [8]. The bit
error rate was improved over the existing
MIMO-FBMC/OQAM scheme. The decomposition design of a
complex training sequence for efficient channel
estimation in Multiple-Input Multiple-Output multicarrier
communications using OQAM was reported by Su Hu et al.
[9], and the simulation results were validated. Uma
Maheswari et al. [10] presented the analysis of a
multicarrier communication system with non-linear power
amplifiers for modern wireless communication systems.


Rodolfo Gomes et al. [11] reported the performance
analysis of FS-FBMC versus OFDM for applications that
require better data rates. An orthogonal frequency
division multiplexing (OFDM) model was designed in
accordance with an IEEE standard for data-rate
applications. Martin and Peissing [12] studied the
capacity gains of single- and multi-user channel-adaptive
waveforms in FBMC/OQAM systems. The cell- and
user-specific channel-adaptive waveforms provided a
low-complexity system design capable of combating the
interference problem in systems without a cyclic prefix.
Jeremy et al. [13] explained a reduced-complexity
architecture for the FBMC/OQAM transmitter. A transmitter
with a pipelined design is able to support different
filter lengths with less design complexity.


Hyungju Nam et al. [14] reported a multicarrier
system for QAM transmission and reception with two
prototype filters. The bit error rate and
signal-to-interference power ratio were evaluated for
FBMC/QAM. Ji-Hyun Moon et al. [15] explained
Peak-to-Average-Power-Ratio reduction in the FBMC-OQAM
system; the power reduction in FBMC/OQAM was evaluated
numerically using simulations. Han Wang et al. [16]
presented a hybrid peak-to-average power reduction scheme
for FBMC/OQAM. In the simulation results, the multi-data
scheme provided a 0.2 dB improvement in peak-to-average
power reduction over the hybrid power transmit sequence.
A VLSI architecture for a fast Fourier transform (FFT)
processor capable of generating a normal output order
sequence was reported by Yun-Nan Chang [17]. The normal
output order sequence was generated by a sequence
conversion method, and power-efficient hardware was
obtained with fewer adders. However, power and complexity
still need to be decreased. Madheswaran et al. [18]
reported a power-efficient modified FFT for a wireless
transmitter. The architecture was designed and verified
using a hardware description language.
Nguyena et al. and Hu et al. [19,20] explained designs
for FFTs and IFFTs with high speed, lower complexity and
improved resource utilization for wireless communication
transmitters. Zhong Hu and Honghui Wan explained how to
reduce power consumption and computational complexity
with traced FFT pruning (TFFTP) [20]. This input-pruning
TFFTP was applied to biological sequence alignment.
Moreover, an FFT core for various wireless modems has
been designed and implemented, and the optimization and
conversion of C code to hardware were analyzed. The
resulting hardware is optimized relative to the IP core
available for Xilinx FPGAs in terms of hardware
requirements. That work explains the default
implementation by Vivado™ HLS, which minimizes resource
consumption. It has high latency and minimal parallelism
and generates good results, but needs further improvement
in power consumption.


An OFDM transceiver chip designed in HDL has been
reported. It is used in wireless communication systems
for high-speed operation. The high-bandwidth capabilities
of OFDM are an advantage for wireless products across
many types of networking systems. Moreover, a modulator
for the Filter Bank Multicarrier system showed less
complexity than the traditional one. The FFT implemented
for the FBMC modulator is a pruned circuit. It takes
advantage of the complex conjugation relations among the
outputs of the FFT stage. These relations depend on the
delay parameter. When it is even, the FFT takes real
values as input. When it is odd, there is a complex
conjugation relation between the even and odd indices of
the FFT output. Complexity is higher in this method.
Replacing the multiplications in the FFT by pass logic
for wireless applications has been presented recently; it
reduces power by 6.5% for a 64-point FFT and is
implemented in VLSI. Minimum bit resolution maps of
FFT/IFFT architectures have been reported, and the high
accuracy and validity of the identified bit resolution
map were experimentally verified on FPGA-based platforms.


1. Enhanced Pruned Algorithm (EPA) for FFT 


The FBMC/OQAM transmitter diagram is shown in
Fig.1. This method of transmission is able to achieve
full time/frequency efficiency through the use of Offset
Quadrature Amplitude Modulation. The maximum number of
QAM samples modulated depends on the number of active
sub-carriers, and unused carriers are set to zero. The
polyphase network (PPN) regenerates the odd samples from
the computed even samples.


Fig.1 Block diagram of OQAM


To reduce the operational complexity, the enhanced
pruned algorithm is applied to the output indices of the
FFT together with a decimation-in-time (DIT) algorithm.
The basic computation of the FFT is carried out by
butterfly units. Their arithmetic operators perform the
complex computations and the twiddle factor
calculations. A general N-point FFT requires O(N log N)
operations.

Conventional FFT algorithms are inefficient because
they perform unnecessary operations on zero-valued
inputs. In general, the system demands high-speed,
low-power operation for all combinations of inputs. The
efficiency of the Inverse Fast Fourier Transform can be
increased by combining the conventional FFT with the
enhanced pruning algorithm proposed in this paper. The
speed of the FFT is increased by removing the operations
whose input and output values are zero; in the pruning
process, those operations are omitted for the
corresponding inputs. These pruning techniques were added
to the filter bank system in the OQAM transmitter to
reduce the operating power and complexity of the system.
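The idea of skipping arithmetic on zero-valued operands can be illustrated with a minimal sketch. It is a generic zero-skipping radix-2 decimation-in-time inverse FFT, not the paper's enhanced pruned algorithm, whose exact rules are given only in the flow chart. The names `pruned_ifft`, `bit_reverse` and `ops`, and the twiddle sign convention, are assumptions made for the illustration.

```python
import cmath

def bit_reverse(i, bits):
    """Reverse the lowest `bits` bits of the integer i."""
    r = 0
    for _ in range(bits):
        r = (r << 1) | (i & 1)
        i >>= 1
    return r

def pruned_ifft(X):
    """Radix-2 DIT inverse FFT that skips arithmetic on known-zero operands.

    Returns (x, ops), where ops counts the complex multiplications,
    additions and subtractions that were actually performed.
    """
    n = len(X)
    bits = n.bit_length() - 1
    assert 1 << bits == n, "length must be a power of two"
    a = [X[bit_reverse(i, bits)] for i in range(n)]
    nonzero = [v != 0 for v in a]          # True where a value may be non-zero
    ops = {"mul": 0, "add": 0, "sub": 0}
    size = 2
    while size <= n:
        half = size // 2
        for start in range(0, n, size):
            for k in range(half):
                i, j = start + k, start + k + half
                if not nonzero[j]:
                    # B = 0: both outputs equal A, so no arithmetic is needed
                    a[j] = a[i]
                    nonzero[j] = nonzero[i]
                    continue
                w = cmath.exp(2j * cmath.pi * k / size)  # assumed inverse twiddle
                t = a[j] * w
                ops["mul"] += 1
                if nonzero[i]:
                    a[i], a[j] = a[i] + t, a[i] - t
                    ops["add"] += 1
                    ops["sub"] += 1
                else:
                    # A = 0: outputs are +t and -t (negation is not counted)
                    a[i], a[j] = t, -t
                nonzero[i] = nonzero[j] = True
        size *= 2
    return [v / n for v in a], ops

# Only two of the eight sub-carriers are active; the rest are set to zero
x, ops = pruned_ifft([1, 2, 0, 0, 0, 0, 0, 0])
```

With a fully populated input, every butterfly in this sketch performs one multiplication, one addition and one subtraction; with mostly zero inputs, most butterflies reduce to a copy.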


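The N-point Inverse Fast Fourier Transform for
$0 \le n \le N-1$ is given by

\[ x(n) = \frac{1}{N} \sum_{k=0}^{N-1} X(k)\, W_N^{kn} \tag{2} \]

where $W_N$ is the twiddle factor. Equation (2) can be decomposed into two sums
over the even and odd input indices:

\[ x(n) = \frac{1}{N} \sum_{m=0}^{N/2-1} X(2m)\, W_N^{2mn} + \frac{1}{N} \sum_{m=0}^{N/2-1} X(2m+1)\, W_N^{(2m+1)n} \]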
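The even/odd split can be checked numerically. The sketch below evaluates the inverse transform once as a single sum over all indices and once as the sum of an even-index and an odd-index partial sum. It assumes the twiddle factor $W_N = e^{j2\pi/N}$; the function names and the test vector are hypothetical.

```python
import cmath

def idft_direct(X):
    """Inverse DFT as a single sum over k = 0..N-1."""
    N = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * n / N) for k in range(N)) / N
            for n in range(N)]

def idft_even_odd(X):
    """Same transform, written as the sum of even- and odd-index partial sums."""
    N = len(X)
    W = cmath.exp(2j * cmath.pi / N)       # assumed twiddle factor W_N
    out = []
    for n in range(N):
        even = sum(X[2 * m] * W ** (2 * m * n) for m in range(N // 2))
        odd = sum(X[2 * m + 1] * W ** ((2 * m + 1) * n) for m in range(N // 2))
        out.append((even + odd) / N)
    return out

X = [1, 2j, -1, 0.5, 0, 3, 1 - 1j, 2]      # arbitrary 8-point test vector
max_diff = max(abs(p - q) for p, q in zip(idft_direct(X), idft_even_odd(X)))
```

Here `max_diff` stays at the level of floating-point rounding, which is what the decomposition predicts.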


The butterfly unit of the radix-2 FFT is shown in Fig.2.


A and B are the inputs and X and Y are the outputs of the
radix-2 FFT algorithm. The twiddle factor is Wn.
The outputs are,
X = A + BWn (5)
Y = A - BWn (6)
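A single radix-2 butterfly can be sketched in a few lines. The sketch takes the difference output as A - B·Wn, as labelled in Fig.2. The function name `butterfly`, the sample inputs and the sign of the twiddle exponent are illustrative assumptions, not taken from the paper.

```python
import cmath

def butterfly(a, b, w):
    """Radix-2 butterfly: returns (X, Y) = (a + b*w, a - b*w)."""
    t = b * w              # the single complex multiplication, shared by both outputs
    return a + t, a - t

# Illustrative inputs; w is an 8-point twiddle factor with an assumed sign convention
w = cmath.exp(2j * cmath.pi / 8)
x_out, y_out = butterfly(1 + 0j, 0.5 - 0.5j, w)
```

Sharing the product `t` is why one butterfly costs one multiplication, one addition and one subtraction.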


Tariq Jamil [21] reported the conversion of complex
numbers to a binary number system, in which the real and
imaginary parts are combined into a single complex number
for various operations.


The total power consumption, in terms of its dynamic
and static components, is given in equation (7).


Fig.2 Butterfly diagram of radix-2 FFT

\[ P_{total} = \alpha_T\, C_L\, V_{DD}^{2}\, f_c + I_l\, V_{DD} \tag{7} \]

where $C_L$ is the load capacitance, $V_{DD}$ is the supply voltage, $I_l$ is the
leakage current, $f_c$ is the clock frequency and $\alpha_T$ is the
switching probability, called the activity ratio. Dynamic
power dissipation occurs only when the circuit is in
working mode, i.e. when the circuit is executing some task
on a set of data. In this paper, an enhanced pruning
method is proposed to reduce the dynamic power by
reducing the number of switching operations.
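The dependence of power on switching activity can be made concrete with the usual CMOS model: a dynamic term proportional to activity ratio, load capacitance, supply voltage squared and clock frequency, plus a static term equal to leakage current times supply voltage. In the sketch below only the 220 MHz clock comes from the paper; the function name and all other component values are placeholders chosen for illustration, not measurements.

```python
def total_power(alpha, c_load, v_dd, f_clk, i_leak):
    """Return (dynamic, static, total) power in watts for the usual CMOS model."""
    dynamic = alpha * c_load * v_dd ** 2 * f_clk   # switching power
    static = i_leak * v_dd                         # leakage power
    return dynamic, static, dynamic + static

# Placeholder values (assumed): 1 nF effective load, 1.0 V supply, 1 mA leakage
base = total_power(alpha=0.2, c_load=1e-9, v_dd=1.0, f_clk=220e6, i_leak=1e-3)
# Halving the activity ratio halves the dynamic term and leaves the static term unchanged
pruned = total_power(alpha=0.1, c_load=1e-9, v_dd=1.0, f_clk=220e6, i_leak=1e-3)
```

Because only the dynamic term scales with the activity ratio, removing switching operations lowers total power by a smaller fraction than it lowers the operation count.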


Four adders are used in each stage of computation, and
four subtractors and four multipliers are likewise used
in each stage. Over the three stages, the total number of
adders is therefore 12, as is the total number of
subtractors and of multipliers. Each adder can perform a
maximum of 4 computations, so the four adders in the
first stage perform 16 computations. The subtractors and
the multipliers each also perform 16 operations per
stage. The maximum number of addition operations in the
whole 8-point FFT is therefore 48. In the partial pruning
algorithm, whenever the inputs and outputs go to zero,
the operations for the respective inputs are omitted
[20]. The total number of complex multiplications and
additions is reduced by this partial pruning method: the
number of addition operations per stage is reduced from
16 to 12, and the numbers of multiplications and
subtractions are likewise reduced to 12 in each stage.
The total number of operations in the 8-point FFT is
reduced from 144 to 108, so a maximum of 36 operations
are avoided by the partial pruning method. Omitting these
operations also reduces the power consumption.
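The counts quoted above follow from multiplying the number of stages, the number of operation types (addition, subtraction, multiplication) and the operations of each type per stage. As a worked check, using only the figures stated in the text:

```latex
\[
\begin{aligned}
\text{conventional:}\quad & 3 \times 3 \times (4 \times 4) = 144,\\
\text{partial pruning:}\quad & 3 \times 3 \times 12 = 108,\\
\text{avoided:}\quad & 144 - 108 = 36 \quad (25\%\ \text{of}\ 144).
\end{aligned}
\]
```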


Fig. 3 Flow Chart of Enhanced Pruned Algorithm for 
Subtractor operator in FFT 


2. RESULTS & DISCUSSION 


The number of addition, subtraction and multiplication
operations is explained over the three stages of
operation using the butterfly diagram. In the first stage
of the conventional FFT, 16 operations each are performed
for addition, subtraction and multiplication. A total of
48 operations is thus performed in each stage, and 144
operations in the conventional FFT as a whole. With
partial pruning, the 48 operations in each stage are
reduced to 36, so the total over the three stages is
reduced to 108; the partial pruning FFT therefore saves
36 operations relative to the conventional FFT. In the
enhanced pruned FFT, the number of operations in the
first stage is reduced from 48 to 32, and the remaining
stages are likewise reduced to 32 operations each. The
total number of operations in the enhanced pruned FFT is
thus reduced to 96, the minimum among the FFTs compared.
A comparison of the number of operations of the proposed
technique with the conventional and partial pruned FFTs
for the 8-point FFT is given in Table 1. The conventional
8-point FFT performs a total of 144 operations.


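Table 1 Comparison of number of operations in
various 8-point FFT


Stage   | Operation       | Conventional FFT | Partial pruned FFT | Enhanced pruned FFT (proposed)
Stage 1 | Additions       | 16               | 12                 | 12
Stage 1 | Subtractions    | 16               | 12                 | 8
Stage 1 | Multiplications | 16               | 12                 | 12
Stage 2 | Additions       | 16               | 12                 | 12
Stage 2 | Subtractions    | 16               | 12                 | 8
Stage 2 | Multiplications | 16               | 12                 | 12
Stage 3 | Additions       | 16               | 12                 | 12
Stage 3 | Subtractions    | 16               | 12                 | 8
Stage 3 | Multiplications | 16               | 12                 | 12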

The number of operations in the 16-point partial pruned
FFT is 288, and the proposed technique reduces this to
248. The proposed technique thus saves 136 operations
relative to the conventional 16-point FFT. A maximum of
40% of the operations are saved by the proposed technique
relative to the conventional FFT for all FFT orders. The
power consumption of the FFT with the proposed algorithm
is 78 mW.
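The totals behind Table 1 and the 16-point figures can be tallied with a short script. The column order (conventional, partial pruned, enhanced pruned) is inferred from the surrounding text, and the conventional 16-point total is derived here as 248 + 136 rather than quoted from the paper. All variable and function names are hypothetical.

```python
# Per-stage counts from Table 1: (conventional, partial pruned, enhanced pruned)
TABLE_1 = {
    "additions": (16, 12, 12),
    "subtractions": (16, 12, 8),
    "multiplications": (16, 12, 12),
}
STAGES = 3  # an 8-point radix-2 FFT has log2(8) = 3 stages

def totals(table, stages):
    """Sum each column over the operation types and multiply by the stage count."""
    return [stages * sum(column) for column in zip(*table.values())]

def reduction_percent(baseline, reduced):
    """Percentage of operations saved relative to the baseline."""
    return 100.0 * (baseline - reduced) / baseline

conv8, partial8, enhanced8 = totals(TABLE_1, STAGES)

# 16-point figures quoted in the text; the conventional total is derived (assumption)
partial16, enhanced16, saved16 = 288, 248, 136
conv16 = enhanced16 + saved16

saving8 = reduction_percent(conv8, enhanced8)
saving16 = reduction_percent(conv16, enhanced16)
```

The script reproduces the 8-point totals of 144, 108 and 96 stated in the text and gives the relative saving of the enhanced pruned FFT for each size.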


The architectures of the enhanced pruned FFTs are
designed in a hardware description language and
synthesized for the XC7z020clg484-1 device from the
Zynq-7000 family. Synthesis results are given in Table 2.
These results are based on an FFT size of 512. The
increase in complexity of the proposed design is
relatively insignificant when the FFT size is increased.
The proposed FFT achieves an overall power reduction
compared with the conventional FFT. The transmitters are
implemented on the ZedBoard using the method given in
[11]. The power consumption of the proposed technique is
given in Table 2. The power consumed by the conventional
FFT was 88 mW [13]. The proposed logic has very low
complexity and utilization compared with the previous
technique developed for the FFT in the FBMC/OQAM
transmitter.


Table 2 Comparison of Power Consumption


Transmitter Architecture                       | Power (mW)
Conventional FFT [13]                          | 88
FFT using Enhanced Pruned Algorithm (Proposed) | 78

The output waveform of the proposed algorithm for the
8-point FFT, generated on the XC7z020clg484-1 device, is
given in Fig.4. The proposed algorithm is implemented in
a hardware description language, so these waveform values
are produced with fewer addition, subtraction and
multiplication operations in the device. The overall
power consumption of the FBMC/OQAM transmitter is reduced
by this FFT with the enhanced pruned algorithm. The speed
of the system is also increased, because bypassing or
omitting operators in the FFT design reduces the
propagation delay between the blocks.


Fig.4. Output Waveform of 8-point FFT using
Enhanced pruned algorithm


3. CONCLUSION


In this work, an enhanced pruned algorithm for the
FPGA implementation of the Inverse Fast Fourier
Transform is proposed. The enhanced pruned FFTs reduce
the complexity of the architecture. The number of
arithmetic operators in the FFTs was reduced by a
maximum of 40% using this enhanced pruned algorithm. For
comparison, the conventional FFT and some pruned FFTs
from previous work have been considered. Analytical and
implementation results for the proposed architecture
have demonstrated the reduction in complexity and power
consumption. The behavioral level of design helps to
minimize the complexity and increase the speed of
operation. The power consumption of the proposed
algorithm is reduced by a maximum of 10% relative to the
conventional FFT design. These algorithms were
implemented using VHDL on the XC7z020clg484-1 device
from the Zynq-7000 family with a clock frequency of
220 MHz. Power reports and utilization statements were
generated by implementing the design on the target
device.